Timeout for PDF extraction from OpenOffice supported document format. by vrybas · Pull Request #34 · documentcloud/docsplit

vrybas · 2012-01-13T16:31:51Z

This is needed because when we call extract_pdf() on a document of more than 400-500 pages,
JODConverter fails with an exception:

Exception in thread "main" org.artofsolving.jodconverter.office.OfficeException: task did not complete within timeout
    at org.artofsolving.jodconverter.office.PooledOfficeManager.execute(PooledOfficeManager.java:88)
    at org.artofsolving.jodconverter.office.ProcessPoolOfficeManager.execute(ProcessPoolOfficeManager.java:78)
    at org.artofsolving.jodconverter.OfficeDocumentConverter.convert(OfficeDocumentConverter.java:78)
    at org.artofsolving.jodconverter.OfficeDocumentConverter.convert(OfficeDocumentConverter.java:69)
    at org.artofsolving.jodconverter.cli.Convert.main(Convert.java:118)
Caused by: java.util.concurrent.TimeoutException
    at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228)
    at java.util.concurrent.FutureTask.get(FutureTask.java:91)
    at org.artofsolving.jodconverter.office.PooledOfficeManager.execute(PooledOfficeManager.java:85)
    ...

The new JODConverter 3.0b4 accepts a timeout parameter, which solves the problem.

I don't know whether the timeout should be hardcoded or exposed as a documented Docsplit option. I did both in separate commits.
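
To make the "hardcoded vs. option" trade-off concrete, here is a minimal sketch of a middle path: the timeout is passed through only when the caller supplies one, so the default behavior is unchanged. The helper name `build_convert_args` and the `timeout_flag` default of `-t` are assumptions for illustration only. They are not taken from Docsplit's source, and the flag should be checked against the JODConverter CLI help before use.

```ruby
# Hypothetical helper, not part of Docsplit: builds the trailing arguments
# for a document-to-PDF conversion command.
# ASSUMPTION: the converter takes its timeout as "<flag> <seconds>";
# the '-t' default is a placeholder and must be verified.
def build_convert_args(doc, out_dir, timeout: nil, timeout_flag: '-t')
  args = []
  # Emit the flag only when a timeout was requested, so omitting it
  # leaves the converter's built-in default untouched.
  args += [timeout_flag, Integer(timeout).to_s] unless timeout.nil?
  args << doc
  # Output file keeps the input's base name with a .pdf extension.
  args << File.join(out_dir, File.basename(doc, '.*') + '.pdf')
  args
end
```

Calling `build_convert_args('report.doc', 'out', timeout: 600)` would place the flag and its value ahead of the input and output paths, while omitting `timeout:` produces the same arguments as before the change.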


tienle · 2012-05-07T04:27:32Z

vote for supporting timeout option. 👍

jravetch · 2012-05-17T07:47:49Z

Me too. Have run into the issue before. Very heavy docs can take almost 5 min to convert to PDF.

mromaine · 2012-05-21T10:05:02Z

+1 here too; what are the chances this pull request will be granted?

pzaich · 2012-11-01T03:07:10Z

+1 Has this been resolved yet? I am running into this problem as well. Anything over 1.5 MB in .doc format seems to time out, along with a lot of PDFs.

alxndrmlr · 2013-06-19T13:09:47Z

lib/docsplit/command_line.rb

Perhaps change this message to "Timeout for PDF extraction from OpenOffice supported document format" so people don't think the flag applies only to OpenOffice files and not to .doc or .xlsx files.

@alxndrmlr, will do, thanks.
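
For readers following the review, here is a rough sketch of how a `--timeout` flag with the suggested help text could be declared using Ruby's standard `OptionParser`. The method name `parse_timeout` and the `NUM` placeholder are assumptions made for this example. The actual code in lib/docsplit/command_line.rb may be organized differently.

```ruby
require 'optparse'

# Hypothetical standalone parser, not Docsplit's actual command_line.rb.
# Returns a hash that contains :timeout only when the flag was given.
def parse_timeout(argv)
  options = {}
  parser = OptionParser.new do |opts|
    # The Integer argument makes OptionParser reject non-numeric values.
    opts.on('--timeout NUM', Integer,
            'Timeout for PDF extraction from OpenOffice supported document format') do |t|
      options[:timeout] = t
    end
  end
  parser.parse(argv) # non-destructive: argv itself is left unmodified
  options
end
```

With this sketch, `parse_timeout(['--timeout', '600'])` yields a hash with `:timeout` set to the integer 600, and an empty argument list yields an empty hash, so no timeout is applied unless requested.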


vrybas added 2 commits January 13, 2012 22:59

Command line parameter and documentation update for timeout option

51b6f7d

alxndrmlr reviewed Jun 19, 2013

Fixed help message for --timeout option

e52a17d

doxavore pushed a commit to ebp/docsplit that referenced this pull request Apr 25, 2014

Add timeout option to JODConverter.

856b122

Original work by documentcloud#34 with modification to not use a default timeout (causing no change from existing functionality).

vrybas wants to merge 3 commits into documentcloud:master from vrybas:timeout_option